Statistical Science 

2012, Vol. 27, No. 3, 340-343 

DOI: 10.1214/12-STS381A 

Main article DOI: 10.1214/11-STS381

In the Public Domain 






Discussion of "Multivariate Bayesian 
Logistic Regression for Analysis of Clinical 
Trial Safety Issues" by W. DuMouchel 

Bradley W. McEvoy and Ram C. Tiwari 

Key words and phrases: Meta-analysis, drug safety, hierarchical Bayesian 
model, data-mining, sparse data. 



We would like to comment on this article by William DuMouchel,
as it gives an interesting application of logistic regression to
clinical safety data. Without wishing to understate the scope of
the multivariate Bayesian logistic regression (MBLR) model, we
consider the use of numerical integration to be arguably its most
important feature. Avoiding Markov chain Monte Carlo (MCMC)
sampling techniques in other data-mining tools, such as the
Multiple-item Gamma Poisson Shrinker (DuMouchel, 1999), has
proven successful for Dr. DuMouchel in gaining their acceptance
among nonstatisticians. MBLR should be no exception.

Because most statisticians lack the clinical insight required to
specify appropriate MBLR model inputs, MBLR is an ideal tool for
use by clinicians. However, its targeted users may not appreciate
some subtleties of MBLR, which we present below. We also present
findings from our empirical evaluation of the MBLR algorithm.
This commentary provides some perspective that we have gained
through multiple interactions with Dr. DuMouchel and from our
reviews of different versions of the MBLR formulation at FDA
since 2009.

1. MBLR AND META-ANALYSIS 

In order to fully appreciate the MBLR methodology, one has to
contrast it with a more traditional meta-analytic formulation for
investigating data from multiple trials. Dr. DuMouchel correctly
points out that the MBLR methodology is in the spirit of a
full-data meta-analysis, and he does not consider it a
meta-analytic model. The current MBLR model formulation does not
offer the flexibility of separating patient-level and trial-level
variation in the model. Consequently, MBLR is very different from
a multi-level/meta-analysis model, which would consist of a
patient-level model and a trial-level model, each with
independent sources of variation. This makes MBLR effectively a
patient-level model; including trial-level variables (e.g., study
identifiers) in equation (2) results in the variance components
in equations (3)-(6) being influenced by both patient and trial
heterogeneity.

This distinction between MBLR and its meta-analytic formulation
is critically important. The main advantage of a meta-analytic
formulation is that it preserves the trial-specific randomized
comparison between the treatment and control groups, thereby
avoiding confounded estimates. With the MBLR formulation this is
not necessarily the case: as Dr. DuMouchel aptly notes for the
Pollakiuria example, the trial-specific estimates do not preserve
the between-trial differences. Additionally, the shrinkage
estimates used to identify vulnerable patient subgroups depend on
factors that are typically considered unrelated to patient
characteristics.

The practical concern with applying a methodology that does not
ensure that the randomized comparison is preserved is that a
possible signal may be missed or hidden. A recent high-profile
example of this concern was the meta-analysis of the diabetes
drug rosiglitazone (Rucker and Schumacher, 2008). When safety
data collected from the randomized controlled trials were pooled
by trial arm, the result was Simpson's paradox.
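The pooling problem described above can be illustrated with a small sketch. The counts below are hypothetical and chosen only to produce the effect; they are not the rosiglitazone data. The sketch assumes two trials with different baseline risks and different allocation ratios, and compares the risk ratio obtained by pooling arms across trials with the within-trial risk ratios and a Mantel-Haenszel estimate that keeps each comparison inside its own trial.

```python
# Hypothetical counts per trial (illustration only, not real trial data):
# (events_treated, n_treated, events_control, n_control)
TRIALS = [
    (15, 1000, 5, 500),    # low-risk population, 2:1 allocation to treatment
    (60, 500, 100, 1000),  # high-risk population, 1:2 allocation to treatment
]


def trial_risk_ratios(trials):
    """Risk ratio (treated vs. control) computed separately within each trial."""
    return [(a / n1) / (c / n0) for a, n1, c, n0 in trials]


def pooled_risk_ratio(trials):
    """Risk ratio after collapsing all trials into a single table by arm."""
    a = sum(t[0] for t in trials)
    n1 = sum(t[1] for t in trials)
    c = sum(t[2] for t in trials)
    n0 = sum(t[3] for t in trials)
    return (a / n1) / (c / n0)


def mantel_haenszel_risk_ratio(trials):
    """Mantel-Haenszel risk ratio, stratified by trial."""
    num = sum(a * n0 / (n1 + n0) for a, n1, c, n0 in trials)
    den = sum(c * n1 / (n1 + n0) for a, n1, c, n0 in trials)
    return num / den


if __name__ == "__main__":
    print("within-trial risk ratios:", trial_risk_ratios(TRIALS))
    print("pooled-by-arm risk ratio:", pooled_risk_ratio(TRIALS))
    print("Mantel-Haenszel risk ratio:", mantel_haenszel_risk_ratio(TRIALS))
```

With these made-up counts the treatment raises the risk within each trial (risk ratios of 1.5 and 1.2), yet the pooled-by-arm risk ratio falls below 1 because most treated patients come from the low-risk trial. The stratified estimate stays above 1.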

It is, therefore, important to understand the sub- 
tle distinction of how MBLR differs from the more 
traditional meta-analytic models, and the potential 
consequences that may arise from the use of MBLR. 
Unfortunately, the MBLR tool/program in its current form cannot
evaluate the potential implications discussed in the preceding
paragraphs. This necessitates the use of other statistical
methodologies to fully evaluate the results from the MBLR
software, which, paradoxically, is the situation that
Dr. DuMouchel initially set out to avoid. That said, it would be
a nice extension if the MBLR methodology were expanded to
incorporate the suggestions outlined above, thereby increasing
the general utility of the tool. Next, we present an attempt
toward this extension.

2. META-ANALYTIC MBLR FORMULATION 

We present a modified MBLR model motivated from a meta-analytic
perspective, which we shall, henceforth, refer to as
meta-analytic MBLR (MA-MBLR). Using the notation from the paper,
let the $J$ covariates correspond only to patient-level
characteristics and assume that there are a total of $L$ trials.
Then, the MA-MBLR patient-level model for trial $l$,
$l = 1, \ldots, L$, and issue $k$ is given by

\[
\operatorname{logit}(p_{ikl}) = \alpha_{0kl} + \sum_{g} X_{igl}\,\alpha_{gk}
  + T_{il}\Bigl(\beta_{0kl} + \sum_{g} X_{igl}\,\beta_{gk}\Bigr).
\]

Unlike the MBLR formulation, the MA-MBLR would assume that the
trial-specific intercept $\alpha_{0kl}$ and treatment effect
$\beta_{0kl}$ have distinct variance components, thereby
separating patient and trial variability. This can be formally
achieved by assigning the trial-specific intercept and treatment
effect the following hierarchical priors:
$\alpha_{0kl} \sim N(\alpha_{0k}, \sigma_{\alpha.k}^{2})$ and
$\beta_{0kl} \sim N(\beta_{0k}, \sigma_{0.k}^{2})$, for
$k = 1, \ldots, K$ and $l = 1, \ldots, L$. The MA-MBLR model is
fully specified by equations (3)-(6), as well as by the
hyperpriors for the model's hyperparameters, and has the
$(2K + 4)$ standard deviations,
$(\sigma_{\alpha.1}, \ldots, \sigma_{\alpha.K}, \sigma_{0.1}, \ldots, \sigma_{0.K}, \sigma_A, \sigma_0, \sigma_B, \tau)$,
that have independent uniform distributions on the interval 0 to
$d$, as specified in the paper.
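The structure of the MA-MBLR patient-level model and its trial-level priors can be sketched in a few lines. This is only an illustration of the formulation described above for a single issue $k$, not the MBLR software or our OpenBUGS code; the function names, argument names and numerical values are hypothetical, and the treatment indicator is assumed to be coded 0/1.

```python
import math
import random


def inv_logit(x):
    """Inverse of the logit link."""
    return 1.0 / (1.0 + math.exp(-x))


def draw_trial_effects(alpha_0k, beta_0k, sd_alpha_k, sd_beta_k, n_trials, rng):
    """Draw (alpha_0kl, beta_0kl) for l = 1..L from their normal priors.

    The trial-specific intercept and treatment effect get their own
    standard deviations, separate from the patient-level components.
    """
    return [
        (rng.gauss(alpha_0k, sd_alpha_k), rng.gauss(beta_0k, sd_beta_k))
        for _ in range(n_trials)
    ]


def event_probability(x, treated, alpha_0kl, beta_0kl, alpha_gk, beta_gk):
    """Probability of issue k for one patient in trial l.

    x        : patient-level covariate values X_igl
    treated  : treatment indicator T_il (assumed 0 or 1)
    alpha_gk : covariate main effects
    beta_gk  : treatment-by-covariate interactions
    """
    eta = alpha_0kl + sum(xg * a for xg, a in zip(x, alpha_gk))
    eta += treated * (beta_0kl + sum(xg * b for xg, b in zip(x, beta_gk)))
    return inv_logit(eta)


if __name__ == "__main__":
    rng = random.Random(2012)  # arbitrary seed
    # Hypothetical hyperparameter values for one issue and three trials.
    effects = draw_trial_effects(-3.0, 0.4, 0.5, 0.2, n_trials=3, rng=rng)
    for l, (a0, b0) in enumerate(effects, start=1):
        p_ctrl = event_probability([1, 0], 0, a0, b0, [0.3, -0.2], [0.1, 0.0])
        p_trt = event_probability([1, 0], 1, a0, b0, [0.3, -0.2], [0.1, 0.0])
        print(f"trial {l}: control {p_ctrl:.3f}, treated {p_trt:.3f}")
```

Setting both trial-level standard deviations to zero collapses every trial onto the same intercept and treatment effect, which removes the trial-level source of variation from the sketch.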



For the data example in the paper, we investigated whether the
choice between the MBLR and MA-MBLR formulations makes a
substantive impact on the risk assessment for the five most
frequent issues. Both the MBLR and MA-MBLR models were fit using
OpenBUGS (Lunn et al., 2009), and thus are fully Bayesian MBLR
and MA-MBLR. The fully Bayesian models differed from the MBLR
model described in the paper in three ways, namely, (i) they
assume diffuse normal priors for the location parameters rather
than uniform noninformative priors, (ii) they constrain the
hyperpriors $A_g$ such that the $g_j$th level of covariate $j$ is
equal to the negative sum of the remaining $g_j - 1$ levels, and
(iii) the upper limit $d$ of the support of the prior for the
standard deviations was increased to 3.

Figure 1 shows the relationship for some of the estimated
parameters. The issue-specific treatment effect $\beta_{0k}$ did
not differ much between models. However, the interaction term
between treatment and the patient-level covariates tended to be
closer to the null value for MA-MBLR, while the MA-MBLR
trial-specific treatment effect tended to be further away from
the null value than MBLR. Although there were no surprising
differences noted between the MBLR and MA-MBLR coefficients for
this example, the two different formulations can possibly result
in different substantive conclusions.

3. BORROWING INFORMATION ACROSS 
ISSUES 

It is important to note that MBLR borrows information across
issues by positing a hierarchical distribution for the parameters
of parallel logistic regression models, and does not model the
joint distribution of the endpoints. An example of the latter
approach is given by Bayesian multivariate logistic regression
(O'Brien and Dunson, 2004). More importantly, there needs to be
recognition among its users that an analysis that borrows
information across issues is not inherently better than one that
does not.

To illustrate a possible peril of borrowing infor- 
mation across issues, suppose the issues selected are 
medically related, but they vary in their severity; 
in particular, assume there is one severe issue that 
occurs infrequently and the remaining issues are less 
severe but occur more frequently. Because the amount of
information borrowed across issues in MBLR is related to the
precision of the estimate (which is a function of the issue
frequency), the effect for the less frequent issue would be
sensitive to the effects for the more frequent issues. It is
important that users of the tool are mindful of such
considerations.
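The dependence of borrowing on precision can be made concrete with a deliberately simplified normal-normal shrinkage calculation. This is not the MBLR algorithm: it assumes known standard errors, a fixed between-issue standard deviation `tau`, and a precision-weighted common mean, and all numbers are hypothetical.

```python
def shrink_across_issues(estimates, std_errors, tau):
    """Shrink issue-level estimates toward a precision-weighted common mean.

    tau is an assumed, fixed between-issue standard deviation.
    Returns (common_mean, list_of_shrunken_estimates).
    """
    total_var = [se ** 2 + tau ** 2 for se in std_errors]
    common_mean = (
        sum(y / v for y, v in zip(estimates, total_var))
        / sum(1.0 / v for v in total_var)
    )
    shrunk = []
    for y, se in zip(estimates, std_errors):
        # Weight kept on the issue's own estimate; small when se is large.
        weight_on_own_data = tau ** 2 / (tau ** 2 + se ** 2)
        shrunk.append(weight_on_own_data * y + (1 - weight_on_own_data) * common_mean)
    return common_mean, shrunk


# Hypothetical log odds ratios: one rare, severe issue (imprecise estimate)
# followed by four frequent, milder issues (precise estimates).
LOG_ODDS_RATIOS = [1.2, 0.1, 0.1, 0.1, 0.1]
STD_ERRORS = [0.8, 0.1, 0.1, 0.1, 0.1]

if __name__ == "__main__":
    mean, shrunk = shrink_across_issues(LOG_ODDS_RATIOS, STD_ERRORS, tau=0.3)
    print("common mean:", round(mean, 3))
    for raw, post in zip(LOG_ODDS_RATIOS, shrunk):
        print(f"raw {raw:.2f} -> shrunken {post:.3f}")
```

In this toy setting the estimate for the rare issue is pulled most of the way toward the value shared by the frequent issues, while the frequent issues barely move. That is the sensitivity described in the paragraph above.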



Fig. 1. Relationship between MA-MBLR and MBLR for selected model parameters.



4. MBLR ESTIMATION ALGORITHM 

As stated previously, we believe the advantage 
of the MBLR methodology is in obtaining poste- 
rior inferences that do not rely on computation- 
ally time-consuming estimation methods (such as 
MCMC methods). However, the timeliness of the 
analysis has to be balanced by the well-known limi- 
tations of the Laplace approximation of the integral 
of the posterior density (Carlin and Louis, 2009), 
which are applicable to MBLR. 
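To give a feel for the kind of limitation meant here, the following toy sketch compares a Laplace (normal) approximation with brute-force grid integration for a single log-odds parameter under sparse data. It is not MBLR's estimation algorithm, which involves many parameters; the one-parameter binomial model, the N(0, prior_sd^2) prior and the counts (1 event in 20 patients) are all assumptions made for illustration.

```python
import math


def log_posterior(theta, events, n, prior_sd):
    """Unnormalized log posterior of a log-odds theta: binomial x normal prior."""
    return (events * theta
            - n * math.log1p(math.exp(theta))
            - theta ** 2 / (2 * prior_sd ** 2))


def laplace_approximation(events, n, prior_sd, iterations=50):
    """Posterior mode by Newton's method and the curvature-based sd at the mode."""
    theta = 0.0
    for _ in range(iterations):
        p = 1.0 / (1.0 + math.exp(-theta))
        gradient = events - n * p - theta / prior_sd ** 2
        hessian = -n * p * (1 - p) - 1.0 / prior_sd ** 2
        theta -= gradient / hessian
    p = 1.0 / (1.0 + math.exp(-theta))
    hessian = -n * p * (1 - p) - 1.0 / prior_sd ** 2
    return theta, math.sqrt(-1.0 / hessian)


def grid_posterior_moments(events, n, prior_sd, lower=-15.0, upper=10.0, step=0.001):
    """Posterior mean and sd by numerical integration on an equally spaced grid."""
    count = int(round((upper - lower) / step)) + 1
    grid = [lower + i * step for i in range(count)]
    log_post = [log_posterior(t, events, n, prior_sd) for t in grid]
    peak = max(log_post)
    weights = [math.exp(lp - peak) for lp in log_post]
    total = sum(weights)
    mean = sum(w * t for w, t in zip(weights, grid)) / total
    var = sum(w * (t - mean) ** 2 for w, t in zip(weights, grid)) / total
    return mean, math.sqrt(var)


if __name__ == "__main__":
    mode, laplace_sd = laplace_approximation(events=1, n=20, prior_sd=3.0)
    mean, grid_sd = grid_posterior_moments(events=1, n=20, prior_sd=3.0)
    print(f"Laplace: mode {mode:.3f}, sd {laplace_sd:.3f}")
    print(f"Grid:    mean {mean:.3f}, sd {grid_sd:.3f}")
```

With sparse counts the posterior of a log-odds is skewed, so a symmetric normal summary centered at the mode need not match the posterior mean and spread obtained by integration.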

As part of the software review at FDA, we evaluated the adequacy
of MBLR's estimation algorithm by contrasting its results with
those obtained from the fully Bayesian MBLR using OpenBUGS; the
comparison was based on the data described in the paper. The
fully Bayesian MBLR differed from the MBLR by points (i) and (ii)
listed above. The two estimation approaches yielded similar
estimates for the variance components
$\phi = (\sigma_A, \sigma_0, \sigma_B, \tau)$, and the parameter
estimates had almost perfect correlation ($\rho = 0.9998$).
However, the relationship based on z-scores
(= estimate/standard error), presented in Figure 2, suggests that
MBLR has smaller standard errors than the fully Bayesian
analysis. This observation is also supported by the simulation
results, where MBLR tended to have a type-I error rate that
slightly exceeded the nominal 10% level.
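The link between understated standard errors and an inflated type-I error rate can be sketched as follows. The calculation assumes a two-sided test based on an unbiased, normally distributed estimate, and the ratio of reported to true standard error (`se_ratio`) is a hypothetical input; it is not a quantity estimated in our evaluation.

```python
from statistics import NormalDist


def actual_type1_error(nominal_alpha, se_ratio):
    """Type-I error of a two-sided z-test when the reported standard error
    equals se_ratio times the true standard error (se_ratio < 1 means the
    standard error is understated)."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - nominal_alpha / 2)
    # |estimate / reported_se| > z_crit  <=>  |estimate / true_se| > z_crit * se_ratio
    return 2 * (1 - std_normal.cdf(z_crit * se_ratio))


if __name__ == "__main__":
    for ratio in (1.0, 0.95, 0.90):
        rate = actual_type1_error(0.10, ratio)
        print(f"reported/true SE = {ratio:.2f}: type-I error = {rate:.3f}")
```

Under these assumptions, even a modest understatement of the standard error pushes the realized error rate above the nominal 10% level.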




Fig. 2. Relationship of z-scores from fully Bayesian model fit using OpenBUGS compared to MBLR.

5. CONCLUSION

The MBLR model will have a profound impact as it is rolled out
for use in clinical safety data analysis. However, realizing
MBLR's potential strengths and recognizing its pitfalls will
require collaboration between its different user constituencies,
namely statisticians and subject-matter experts.